Punny Captions: Witty Wordplay in Image Descriptions
Wit is a form of rich interaction that is often grounded in a specific
situation (e.g., a comment in response to an event). In this work, we attempt
to build computational models that can produce witty descriptions for a given
image. Inspired by a cognitive account of humor appreciation, we employ
linguistic wordplay, specifically puns, in image descriptions. We develop two
approaches which involve retrieving witty descriptions for a given image from a
large corpus of sentences, or generating them via an encoder-decoder neural
network architecture. We compare our approach against meaningful baseline
approaches via human studies and show substantial improvements. We find that
when a human is subject to similar constraints as the model regarding word
usage and style, people vote the image descriptions generated by our model to
be slightly wittier than human-written witty descriptions. Unsurprisingly,
humans are almost always wittier than the model when they are free to choose
the vocabulary, style, etc.
Comment: NAACL 2018 (11 pages)
Active Domain Adaptation via Clustering Uncertainty-weighted Embeddings
Generalizing deep neural networks to new target domains is critical to their
real-world utility. In practice, it may be feasible to get some target data
labeled, but to be cost-effective it is desirable to select a
maximally-informative subset via active learning (AL). We study the problem of
AL under a domain shift, called Active Domain Adaptation (Active DA). We
empirically demonstrate how existing AL approaches based solely on model
uncertainty or diversity sampling are suboptimal for Active DA. Our algorithm,
Active Domain Adaptation via Clustering Uncertainty-weighted Embeddings
(ADA-CLUE), i) identifies target instances for labeling that are both uncertain
under the model and diverse in feature space, and ii) leverages the available
source and target data for adaptation by optimizing a semi-supervised
adversarial entropy loss that is complementary to our active sampling
objective. On standard image classification-based domain adaptation benchmarks,
ADA-CLUE consistently outperforms competing active adaptation, active learning,
and domain adaptation methods across domain shifts of varying severity.
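The sampling idea in step i) can be illustrated with a small sketch. This is only one plausible reading of "clustering uncertainty-weighted embeddings", not the authors' implementation: it assumes predictive entropy as the uncertainty measure, a weighted k-means over target embeddings, and selection of the instance nearest each centroid. The function names (`entropy`, `weighted_kmeans`, `select_for_labeling`) are hypothetical, and the adversarial entropy loss of step ii) is not covered.

```python
import math
import random


def entropy(probs):
    """Shannon entropy of one predicted class distribution (assumed uncertainty score)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def sq_dist(a, b):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def weighted_kmeans(points, weights, k, iters=20, seed=0):
    """Lloyd's algorithm in which each point pulls its centroid in proportion to its weight."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    dim = len(points[0])
    for _ in range(iters):
        # Assignment step: index of the nearest centroid for every point.
        assign = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        # Update step: uncertainty-weighted mean of each cluster's members.
        for c in range(k):
            members = [i for i, a in enumerate(assign) if a == c]
            total = sum(weights[i] for i in members)
            if total > 0:  # keep the old centroid if the cluster is empty or weightless
                centroids[c] = [
                    sum(weights[i] * points[i][d] for i in members) / total
                    for d in range(dim)
                ]
    return centroids


def select_for_labeling(embeddings, probs, budget):
    """Pick `budget` distinct target indices: the instance nearest each weighted centroid."""
    weights = [entropy(p) for p in probs]
    centroids = weighted_kmeans(embeddings, weights, budget)
    chosen = []
    for c in centroids:
        candidates = (i for i in range(len(embeddings)) if i not in chosen)
        chosen.append(min(candidates, key=lambda i: sq_dist(embeddings[i], c)))
    return chosen
```

Weighting the centroid update by entropy draws cluster centres toward uncertain regions, while taking one instance per cluster keeps the selection spread out in feature space.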
Evaluating Visual Conversational Agents via Cooperative Human-AI Games
As AI continues to advance, human-AI teams are inevitable. However, progress
in AI is routinely measured in isolation, without a human in the loop. It is
crucial to benchmark progress in AI, not just in isolation, but also in terms
of how it translates to helping humans perform certain tasks, i.e., the
performance of human-AI teams.
In this work, we design a cooperative game - GuessWhich - to measure human-AI
team performance in the specific context of the AI being a visual
conversational agent. GuessWhich involves live interaction between the human
and the AI. The AI, which we call ALICE, is provided an image which is unseen
by the human. Following a brief description of the image, the human questions
ALICE about this secret image to identify it from a fixed pool of images.
We measure performance of the human-ALICE team by the number of guesses it
takes the human to correctly identify the secret image after a fixed number of
dialog rounds with ALICE. We compare performance of the human-ALICE teams for
two versions of ALICE. Our human studies suggest a counterintuitive trend:
while the AI literature shows that one version outperforms the other when
paired with an AI questioner bot, this improvement in AI-AI
performance does not translate to improved human-AI performance. This suggests
a mismatch between benchmarking of AI in isolation and in the context of
human-AI teams.
Comment: HCOMP 201
Prognosis Following Surgery for Recurrent Ovarian Cancer and Diagnostic Criteria Predictive of Cytoreduction Success: A Systematic Review and Meta-Analysis
For women achieving clinical remission after the completion of initial treatment for epithelial ovarian cancer, 80% with advanced-stage disease will develop recurrence. However, the standard treatment of women with recurrent platinum-sensitive disease remains poorly defined. Secondary (SCS), tertiary (TCS) or quaternary (QCS) cytoreduction surgery for recurrence has been suggested to be associated with increased overall survival (OS). We searched five databases for studies reporting death rate, OS, cytoreduction rates, post-operative morbidity/mortality and diagnostic models predicting complete cytoreduction in a platinum-sensitive disease recurrence setting. Death rates calculated from raw data were pooled based on a random-effects model. Meta-regression/linear regression was performed to explore the role of complete or optimal cytoreduction as a moderator. Pooled death rates were 45%, 51% and 66% for SCS, TCS and QCS, respectively. Median OS for optimal cytoreduction ranged from 16–91, 24–99 and 39–135 months for SCS, TCS and QCS, respectively. Every 10% increase in complete cytoreduction rates at SCS corresponds to a 7% increase in median OS. Complete cytoreduction rates ranged from 9–100%, 35–90% and 33–100% for SCS, TCS and QCS, respectively. Major post-operative thirty-day morbidity was reported to range from 0–47%, 13–33% and 15–29% for SCS, TCS and QCS, respectively. Thirty-day post-operative mortality was 0–6%, 0–3% and 0–2% for SCS, TCS and QCS, respectively. There were two externally validated diagnostic models predicting complete cytoreduction at SCS, but none for TCS and QCS. In conclusion, our data confirm that maximal effort higher order cytoreductive surgery resulting in complete cytoreduction can improve survival.
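As a rough illustration of what pooling death rates under a random-effects model involves, the sketch below applies the DerSimonian-Laird estimator to untransformed proportions. The abstract does not state which estimator or transformation was used, so both choices are assumptions, and the function name `pool_random_effects` is hypothetical. It also assumes every study has a death rate strictly between 0 and 1, since the variance would otherwise be zero.

```python
def pool_random_effects(events, totals):
    """Pool per-study proportions with a DerSimonian-Laird random-effects model.

    events[i] is the number of deaths and totals[i] the number of patients in study i.
    Returns (pooled proportion, between-study variance tau^2, standard error).
    """
    props = [e / n for e, n in zip(events, totals)]
    # Within-study variance of a raw proportion (assumes 0 < p < 1).
    variances = [p * (1 - p) / n for p, n in zip(props, totals)]

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1 / v for v in variances]
    fixed = sum(wi * p for wi, p in zip(w, props)) / sum(w)
    q = sum(wi * (p - fixed) ** 2 for wi, p in zip(w, props))

    # Method-of-moments estimate of the between-study variance, truncated at zero.
    df = len(props) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * p for wi, p in zip(w_star, props)) / sum(w_star)
    se = (1 / sum(w_star)) ** 0.5
    return pooled, tau2, se
```

Calling `pool_random_effects([20, 30, 45], [50, 60, 70])` with these made-up counts (not taken from the review) returns a pooled rate lying between the smallest and largest study rates, together with the estimated heterogeneity.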
Implementation of Multigene Germline and Parallel Somatic Genetic Testing in Epithelial Ovarian Cancer: SIGNPOST Study
We present findings of a cancer multidisciplinary-team (MDT) coordinated mainstreaming pathway of unselected 5-panel germline BRCA1/BRCA2/RAD51C/RAD51D/BRIP1 and parallel somatic BRCA1/BRCA2 testing in all women with epithelial-OC and highlight the discordance between germline and somatic testing strategies across two cancer centres. Patients were counselled and consented by a cancer MDT member. The uptake of parallel multi-gene germline and somatic testing was 97.7%. Counselling by a clinical nurse specialist more frequently needed >1 consultation (53.6% (30/56)) compared to a medical (15.0% (21/137)) or surgical oncologist (15.3% (17/110)) (p 0.001). The median age was 54 (IQR = 51–62) years in germline pathogenic-variant (PV) versus 61 (IQR = 51–71) in BRCA wild-type (p = 0.001). There was no significant difference in distribution of PVs by ethnicity, stage, surgery timing or resection status. A total of 15.5% germline and 7.8% somatic BRCA1/BRCA2 PVs were identified. A total of 2.3% of patients had RAD51C/RAD51D/BRIP1 PVs. A total of 11% of germline PVs were large-genomic-rearrangements and were missed by somatic testing. A total of 20% of germline PVs are missed by a somatic-first BRCA-testing approach and 55.6% of germline PVs are missed by family history ascertainment. The somatic testing failure rate is higher (23%) for patients undergoing diagnostic biopsies. Our findings favour a prospective parallel somatic and germline panel testing approach as a clinically efficient strategy to maximise variant identification. UK Genomics test-directory criteria should be expanded to include a panel of OC genes.
Peer reviewed
Towards natural human-AI interactions in vision and language
Inter-human interaction is a rich form of communication. Human interactions typically leverage a good theory of mind, involve pragmatics, story-telling, humor, sarcasm, empathy, sympathy, etc. Recently, we have seen a tremendous increase in the frequency and the modalities through which humans interact with AI. Despite this, current human-AI interactions lack many of these features that characterize inter-human interactions. Towards the goal of developing AI that can interact with humans naturally (similar to other humans), I take a two-pronged approach that involves investigating the ways in which both the AI and the human can adapt to each other's characteristics and capabilities. In my research, I study aspects of human interactions, such as humor, story-telling, and the humans' abilities to understand and collaborate with an AI. Specifically, in the vision and language modalities,
1. In an effort to improve the AI's capabilities to adapt its interactions to a human, we build computational models for (i) humor manifested in static images, (ii) contextual, multi-modal humor, and (iii) temporal understanding of the elements of a story.
2. In an effort to improve the capabilities of a collaborative human-AI team, we study (i) a lay person's predictions regarding the behavior of an AI in a situation, and (ii) the extent to which interpretable explanations from an AI can improve performance of a human-AI team.
Through this work, I demonstrate that aspects of human interactions (such as certain forms of humor and story-telling) can be modeled with reasonable success using computational models that utilize neural networks. On the other hand, I also show that a lay person can successfully predict the outputs and failures of a deep neural network. Finally, I present evidence that suggests that a lay person who has access to interpretable explanations from the model can collaborate more effectively with a neural network on a goal-driven task.
Ph.D.